An Analysis of a Combined Hardware-software Mechanism for Speculative Loads
Abstract
This paper describes a simple hardware mechanism and related compiler support for software-controlled speculative loads. The compiler issues speculative load instructions based on anticipated data references and the ability of the memory system to hide memory latency in high-performance processors. The architectural support for such a mechanism is simple and minimal, yet handles faults gracefully. We have simulated three speculative load mechanisms based on a MIPS processor and a detailed memory system. The results of scientific kernel loops indicate that speculative load techniques can hide memory latency effectively.
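The abstract does not spell out the mechanism, so the following is only a minimal sketch of the general idea of a software-controlled speculative load, not the paper's design. It assumes a load that records a deferred-fault flag instead of trapping, and a loop in which the load for the next element is issued before the current element is consumed. The names `spec_result`, `spec_load`, and `sum_with_speculative_loads` are hypothetical, and the bounds check merely stands in for the hardware's fault detection.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a speculative load result:
   the loaded value plus a deferred-fault flag. */
typedef struct {
    double value;
    bool   faulted;  /* set instead of trapping when the address is invalid */
} spec_result;

/* Models a non-faulting load (an assumption, not the paper's hardware):
   an out-of-range index sets the flag rather than raising a fault. */
static spec_result spec_load(const double *base, size_t len, size_t idx) {
    spec_result r = {0.0, true};
    if (idx < len) {
        r.value = base[idx];
        r.faulted = false;
    }
    return r;
}

/* Sums a[0..n-1], issuing the load for element i+1 before element i is used,
   so that memory latency can overlap with the computation. */
double sum_with_speculative_loads(const double *a, size_t n) {
    if (n == 0) {
        return 0.0;
    }
    double total = 0.0;
    spec_result next = spec_load(a, n, 0);
    for (size_t i = 0; i < n; i++) {
        spec_result cur = next;
        /* Speculative: on the last iteration this runs one past the end. */
        next = spec_load(a, n, i + 1);
        /* Check at the point of use; a consumed value must not be faulted. */
        assert(!cur.faulted);
        total += cur.value;
    }
    /* The final speculative result is faulted but never consumed. */
    return total;
}
```

In this sketch the load issued past the end of the array on the last iteration is simply discarded, which is one way a speculative load can avoid raising a spurious fault.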
Similar resources
Architecture-Compatible Code Boosting for Performance Enhancement of the IBM RS/6000
are four main areas in which we see opportunities for future work. The first is in measuring the effect of hardware extensions to our current machine model for supporting unsafe code boosting. The second is implementing a software mechanism similar to those proposed by Bernstein et al. [10] for proving the safety of speculative loads and measuring its impact on performance. The third is augment...
Performance Evaluation of Configurable Hardware Features on the AMD-K5
Many modern processors incorporate certain configurable hardware features, although these features are never publicized. For instance, the AMD-K5 incorporates the ability to disable branch prediction, put caches into write-allocate mode, etc. The ability to configure the features by software combined with the availability of on-chip performance counters allows the direct measurement of the perfo...
Exploring Thread-Level Speculation in Software: The Effects of Memory Access Tracking Granularity
Speculative execution is often the only way to overcome dataflow-imposed limitations and exploit parallelism when dependences can be discovered only at run-time. It also facilitates automatic parallelization of programs that exhibit complicated memory access patterns, which make complete compile-time dependence analysis either impossible or extremely complicated. A number of approaches for coar...
A Chip-Multiprocessor Architecture with Speculative Multithreading
Much emphasis is now placed on chip-multiprocessor (CMP) architectures for exploiting thread-level parallelism in an application. In such architectures, speculation may be employed to execute applications that cannot be parallelized statically. In this paper, we present an efficient CMP architecture for speculative execution of sequential binaries without source recompilation. We present the s...
Franklin and Sohi: ARB - A Hardware Mechanism for Dynamic Reordering of Memory
To exploit instruction-level parallelism, it is important not only to execute multiple memory references per cycle, but also to reorder memory references, especially to execute loads before stores that precede them in the sequential instruction stream. To guarantee correctness of execution in such situations, memory reference addresses have to be disambiguated. This paper presents a novel hardwa...